Hands-on Exercise 3b: Programming Animated Statistical Graphics with R

Author

Vanessa Heng

Published

January 24, 2024

Modified

March 9, 2024

1 Overview

In this exercise, you will learn how to create animated data visualisations by using the gganimate and plotly R packages.

At the same time, you will also learn how to reshape data by using the tidyr package, and how to process, wrangle and transform data by using the dplyr package.

1.1 Basic concepts of animation

When creating animations, the plot does not actually move. Instead, many individual plots are built and then stitched together as movie frames, just like an old-school flip book or cartoon. Each frame is a separate plot built from a relevant subset of the full data set, and the sequence of subsets drives the flow of the animation once the frames are stitched together.
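
To make the flip-book idea concrete, here is a minimal sketch in base R. The toy data frame and its column names are made up for illustration and are not part of the exercise data. It splits the data into one subset per year, which is roughly the kind of per-frame subset an animation package draws before stitching the frames together.

```r
# Toy data (hypothetical values, for illustration only)
toy <- data.frame(
  year  = rep(c(2000, 2005, 2010), each = 2),
  group = rep(c("A", "B"), times = 3),
  value = c(10, 20, 12, 18, 15, 16)
)

# One subset per year: each subset would be drawn as one frame
frames <- split(toy, toy$year)

# Frame order follows the sorted years
names(frames)

# The rows that make up the first frame
frames[["2000"]]
```

Real animation packages also generate in-between frames so that the motion looks smooth, which this sketch leaves out.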

1.2 Terminology

Before we dive into the steps for creating an animated statistical graph, it is important to understand some key concepts and terminology related to this type of visualisation.

  1. Frame: In an animated line graph, each frame represents a different point in time or a different category. When the frame changes, the data points on the graph are updated to reflect the new data.

  2. Animation Attributes: The animation attributes are the settings that control how the animation behaves. For example, you can specify the duration of each frame, the easing function used to transition between frames, and whether to start the animation from the current frame or from the beginning.

Tip

Before we start making animated graphs, we should first ask ourselves: Does it make sense to go through the effort?

If we are conducting an exploratory data analysis, an animated graphic may not be worth the time investment. However, if we are giving a presentation, a few well-placed animated graphics can help an audience connect with our topic remarkably better than static counterparts.

2 Getting Started

2.1 Loading the R Packages

We will use the following R packages in this exercise:

  • tidyverse, a family of modern R packages specially designed to support data science, analysis, and communication tasks including creating static statistical graphs.

  • gganimate, a ggplot2 extension for creating animated statistical graphs.

  • gifski converts video frames to GIF animations using pngquant’s fancy features for efficient cross-frame palettes and temporal dithering. It produces animated GIFs that use thousands of colors per frame.

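  • gapminder, an excerpt of the data available at Gapminder.org. We only use its country_colors colour scheme.

  • plotly, an R package for creating interactive statistical graphs.

  • readxl, a tidyverse-family package for importing Excel worksheets into R.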

The code chunk below loads these packages, installing any that are missing.

pacman::p_load(readxl, gifski, gapminder,
               plotly, gganimate, tidyverse) 

2.2 Importing Data

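For this exercise, we will use the Data worksheet of the GlobalPopulation Excel workbook, imported as globalPop. Because the file is in xls format, we read it with the read_xls() function of the readxl package, which belongs to the tidyverse family.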

# Columns to convert to factors
col <- c("Country", "Continent")
globalPop <- read_xls("data/GlobalPopulation.xls",
                      sheet = "Data") %>%
  # across() supersedes mutate_at() in current dplyr
  mutate(across(all_of(col), as.factor)) %>%
  mutate(Year = as.integer(Year))

Note

“Country” and “Continent” are converted to factors (categorical data) and “Year” is converted to integer.
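
The type conversion can be tried without the Excel file. The sketch below uses a small hypothetical tibble as a stand-in for the imported worksheet (the values are invented), and uses across(), the current dplyr equivalent of mutate_at(), to convert the two text columns.

```r
library(dplyr)

# Hypothetical stand-in for the imported worksheet
toy <- tibble(
  Country   = c("Singapore", "Japan"),
  Continent = c("Asia", "Asia"),
  Year      = c(1996, 1996)   # numeric (double) before conversion
)

col <- c("Country", "Continent")
toy_clean <- toy %>%
  mutate(across(all_of(col), as.factor)) %>%
  mutate(Year = as.integer(Year))

# Check the resulting column types
sapply(toy_clean, class)
```

Checking the column classes in this way is a quick sanity test after any import step.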

3 Animated Data Visualisation: gganimate methods

gganimate extends the grammar of graphics as implemented by ggplot2 to include the description of animation. It does this by providing a range of new grammar classes that can be added to the plot object in order to customise how it should change with time.

  • transition_*() defines how the data should be spread out and how it relates to itself across time.

  • view_*() defines how the positional scales should change along the animation.

  • shadow_*() defines how data from other points in time should be presented in the given point in time.

  • enter_*()/exit_*() defines how new data should appear and how old data should disappear during the course of the animation.

  • ease_aes() defines how different aesthetics should be eased during transitions.
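
As a rough illustration of how these grammar classes combine, the sketch below adds one function from each family to a small plot. The toy data frame is hypothetical, and the particular choices (view_follow(), shadow_wake(), enter_fade()/exit_fade()) are just one possible combination, not the one used later in this exercise.

```r
library(ggplot2)
library(gganimate)

# Toy data (hypothetical values, for illustration only)
toy <- data.frame(
  year  = rep(2000:2004, times = 2),
  group = rep(c("A", "B"), each = 5),
  x     = c(1, 2, 3, 4, 5, 5, 4, 3, 2, 1),
  y     = c(2, 3, 5, 4, 6, 6, 5, 3, 4, 2)
)

anim <- ggplot(toy, aes(x = x, y = y, colour = group)) +
  geom_point(size = 4) +
  labs(title = "Year: {frame_time}") +
  transition_time(year) +            # transition_*(): one state per year
  view_follow() +                    # view_*(): axes follow the data
  shadow_wake(wake_length = 0.2) +   # shadow_*(): fading trail of earlier frames
  enter_fade() + exit_fade() +       # enter_*()/exit_*(): fade points in and out
  ease_aes("cubic-in-out")           # ease_aes(): ease in and out of each state

# Printing `anim` in an interactive session renders the animation
class(anim)
```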

The code chunk below uses basic ggplot2 functions to build a static bubble plot.
ggplot(globalPop, aes(x = Old, y = Young, 
                      size = Population, 
                      colour = Country)) +
  geom_point(alpha = 0.7, 
             show.legend = FALSE) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  labs(title = 'Year: {frame_time}', 
       x = '% Aged', 
       y = '% Young') 

transition_time() of gganimate is used to create transitions through distinct states in time (i.e. Year).

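ease_aes() is used to control the easing of aesthetics. The default is linear. Other easing functions are: quadratic, cubic, quartic, quintic, sine, circular, exponential, elastic, back, and bounce. Each can be combined with an -in, -out, or -in-out modifier, for example 'cubic-in-out'.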
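
To see what an easing function does to the in-between values, the following base R sketch compares a linear curve with a cubic in-out curve. The functions are hand-written for illustration (the names ease_linear, ease_cubic_in_out and tween are hypothetical) and are not gganimate's internal code.

```r
# Hand-written easing curves, for illustration only
ease_linear <- function(t) t
ease_cubic_in_out <- function(t) {
  ifelse(t < 0.5, 4 * t^3, 1 - (-2 * t + 2)^3 / 2)
}

# Move a value from `from` to `to` over `n` frames using an easing curve
tween <- function(from, to, n, ease) {
  t <- seq(0, 1, length.out = n)   # progress through the transition
  from + (to - from) * ease(t)
}

tween(0, 100, n = 5, ease = ease_linear)        # 0, 25, 50, 75, 100
tween(0, 100, n = 5, ease = ease_cubic_in_out)  # 0, 6.25, 50, 93.75, 100
```

With the cubic curve the value changes slowly near the start and end and quickly in the middle. In gganimate, a comparable effect would be requested with ease_aes('cubic-in-out').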

ggplot(globalPop, aes(x = Old, y = Young, 
                      size = Population, 
                      colour = Country)) +
  geom_point(alpha = 0.7, 
             show.legend = FALSE) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  labs(title = 'Year: {frame_time}', 
       x = '% Aged', 
       y = '% Young') +
  transition_time(Year) +       
  ease_aes('linear')  
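
An animated plot object is rendered when it is printed, but the rendering can also be controlled explicitly. The sketch below is one way to do that, assuming the gifski package is installed. The toy data, the frame settings, and the output file name are all hypothetical choices made for illustration.

```r
library(ggplot2)
library(gganimate)

# Toy data (hypothetical values, for illustration only)
toy <- data.frame(
  year  = rep(2000:2004, times = 2),
  group = rep(c("A", "B"), each = 5),
  x     = c(1, 2, 3, 4, 5, 5, 4, 3, 2, 1),
  y     = c(2, 3, 5, 4, 6, 6, 5, 3, 4, 2)
)

p <- ggplot(toy, aes(x = x, y = y, colour = group)) +
  geom_point(size = 4) +
  labs(title = "Year: {frame_time}") +
  transition_time(year) +
  ease_aes("linear")

# Render 20 frames at 10 frames per second with the gifski renderer
anim <- animate(p, nframes = 20, fps = 10,
                width = 400, height = 300,
                renderer = gifski_renderer())

# Save the rendered GIF to a temporary file (hypothetical location)
out_file <- file.path(tempdir(), "toy_animation.gif")
anim_save(out_file, animation = anim)
file.exists(out_file)
```

More frames and a higher fps generally give smoother motion at the cost of longer rendering time and a larger file.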

4 Animated Data Visualisation: plotly method

In the plotly R package, both ggplotly() and plot_ly() support key frame animations through the frame argument/aesthetic. They also support an ids argument/aesthetic to ensure smooth transitions between objects with the same id (which helps facilitate object constancy).

gg <- ggplot(globalPop, 
       aes(x = Old, 
           y = Young, 
           size = Population, 
           colour = Country)) +
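  # frame is not a ggplot2 aesthetic (expect an "unknown aesthetics"
  # warning); ggplotly() uses it to define the animation frames
  geom_point(aes(frame = Year),
             alpha = 0.7) +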
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  labs(x = '% Aged', y = '% Young') + 
  theme(legend.position='none')

ggplotly(gg)

The same animated bubble plot can also be built directly with plot_ly(), as in the code chunk below.

bp <- globalPop %>%
  plot_ly(x = ~Old, 
          y = ~Young, 
          size = ~Population, 
          color = ~Continent,
          sizes = c(2, 100),
          frame = ~Year, 
          text = ~Country, 
          hoverinfo = "text",
          type = 'scatter',
          mode = 'markers'
          ) %>%
  layout(showlegend = FALSE)
bp
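
The ids argument and the animation attributes described under Terminology can be sketched as follows. The toy data frame is hypothetical, and the frame duration, easing name, and slider prefix are arbitrary illustrative values rather than settings taken from this exercise.

```r
library(plotly)

# Toy data (hypothetical values, for illustration only)
toy <- data.frame(
  Country    = rep(c("A", "B"), times = 2),
  Year       = rep(c(2000, 2010), each = 2),
  Old        = c(5, 8, 7, 12),
  Young      = c(40, 30, 35, 25),
  Population = c(100, 300, 120, 320)
)

bp_toy <- toy %>%
  plot_ly(x = ~Old,
          y = ~Young,
          size = ~Population,
          frame = ~Year,     # one animation frame per year
          ids = ~Country,    # same id across frames keeps each bubble's identity
          text = ~Country,
          hoverinfo = "text",
          type = "scatter",
          mode = "markers") %>%
  # 1000 ms per frame with linear easing between frames
  animation_opts(frame = 1000, easing = "linear", redraw = FALSE) %>%
  # Label the slider with the current frame value
  animation_slider(currentvalue = list(prefix = "Year: "))

# Print `bp_toy` in an interactive session to view the animation
```

Using ids means each country's bubble moves from its old position to its new one instead of being redrawn from scratch.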